Sparse supermatrices for phylogenetic inference: taxonomy, alignment, rogue taxa, and the phylogeny of living turtles.
Abstract
As phylogenetic data sets grow in size and number, objective methods to summarize this information are becoming increasingly important. Supermatrices can combine existing data directly and in principle provide effective syntheses of phylogenetic information that may reveal new relationships. However, several serious difficulties exist in the construction of large supermatrices that must be overcome before these approaches will enjoy broad utility. We present analyses that examine the performance of sparse supermatrices constructed from large sequence databases for the reconstruction of species-level phylogenies. We develop a largely automated informatics pipeline that allows for the construction of sparse supermatrices from GenBank data. In doing so, we develop strategies for alleviating some of the outstanding impediments to accurate phylogenetic inference using these approaches. These include taxonomic standardization, automated alignment, and the identification of rogue taxa. We use turtles as an exemplar clade and present a well-supported species-level phylogeny for two-thirds of all turtle species based on an approximately 50 kb supermatrix consisting of 93% missing data. Finally, we discuss some of the remaining pitfalls and concerns associated with supermatrix analyses, provide comparisons to supertree approaches, and suggest areas for future research.
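To make the term "sparse supermatrix" concrete, here is a minimal Python sketch of the general idea: per-gene alignments are concatenated, and any taxon lacking a gene is padded with a missing-data symbol. It is not the pipeline described in the abstract. The function names, the "?" padding symbol, and the toy sequences are all assumptions chosen for illustration.

```python
from typing import Dict, List


def build_supermatrix(gene_alignments: List[Dict[str, str]],
                      missing: str = "?") -> Dict[str, str]:
    """Concatenate per-gene alignments into one row per taxon.

    Each alignment maps taxon name -> aligned sequence. A taxon that is
    absent from a gene gets a run of `missing` symbols of that gene's width.
    (Hypothetical helper, not taken from the paper.)
    """
    taxa = sorted({taxon for aln in gene_alignments for taxon in aln})
    rows: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    for aln in gene_alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("all sequences in one gene alignment must have equal length")
        width = lengths.pop()
        for taxon in taxa:
            rows[taxon].append(aln.get(taxon, missing * width))
    return {taxon: "".join(parts) for taxon, parts in rows.items()}


def missing_fraction(matrix: Dict[str, str], missing: str = "?") -> float:
    """Fraction of cells in the matrix that hold the missing-data symbol."""
    total = sum(len(seq) for seq in matrix.values())
    absent = sum(seq.count(missing) for seq in matrix.values())
    return absent / total if total else 0.0


# Toy data: two genes, three taxa, invented sequences.
gene_a = {"Chelonia_mydas": "ACGT", "Chelydra_serpentina": "ACGA"}
gene_b = {"Chelonia_mydas": "GG", "Trachemys_scripta": "GC"}

matrix = build_supermatrix([gene_a, gene_b])
for taxon, row in matrix.items():
    print(taxon, row)
print(f"missing fraction: {missing_fraction(matrix):.2f}")
```

In this toy case 6 of 18 cells are padding, so one third of the matrix is missing. The same bookkeeping applied to a real data set is what gives a figure such as the 93% reported in the abstract.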
Related articles
Phylogenetic supermatrix analysis of GenBank sequences from 2228 papilionoid legumes.
A comprehensive phylogeny of papilionoid legumes was inferred from sequences of 2228 taxa in GenBank release 147. A semiautomated analysis pipeline was constructed to download, parse, assemble, align, combine, and build trees from a pool of 11,881 sequences. Initial steps included all-against-all BLAST similarity searches coupled with assembly, using a novel strategy for building length-homogen...
A large phylogeny of turtles (Testudines) using molecular data
Turtles (Testudines) form a monophyletic group with a highly distinctive body plan. The taxonomy and phylogeny of turtles are still under discussion, at least for some clades. Whereas in most previous studies, only a few species or genera were considered, we here use an extensive compilation of DNA sequences from nuclear and mitochondrial genes for more than two thirds of the total number of tu...
Molecular phylogeny of clade III nematodes reveals multiple origins of tissue parasitism.
Molecular phylogenetic analyses of 113 taxa representing Ascaridida, Rhigonematida, Spirurida and Oxyurida were used to infer a more comprehensive phylogenetic hypothesis for representatives of 'clade III'. The posterior probability of multiple alignment sites was used to exclude or weight characters, yielding datasets that were analysed using maximum parsimony, likelihood, and Bayesian inferen...
Terrace Aware Phylogenomic Inference from Supermatrices
One approach in phylogenomics to infer the tree of life is based on concatenated multiple sequence alignments from many genes. Unfortunately, the resulting so-called supermatrix is usually sparse, that is, not every gene sequence is available for all species in the supermatrix. Due to the missing sequence information a phylogenetic inference, assuming that each gene evolves with its own substit...
Molecular phylogeny of Scutellaria (Lamiaceae; Scutellarioideae) in Iranian highlands inferred from nrITS and trnL-F sequences
Scutellaria with about 360 species is one of the largest genera of Lamiaceae. The Iranian highlands accommodate about 40 Scutellaria spp., and is considered as one of the main centers of diversity of the genus. Here, we present a phylogenetic study for 44 species of Scutellaria especially from Iranian highlands, representing major subgeneric taxa, based on nuclear rib...
Journal: Systematic Biology
Volume 59, Issue 1
Publication date: 2010